Multiple factor analysis: principal component analysis for multitable and multiblock data sets
Authors
Abstract
Multiple factor analysis (MFA, also called multiple factorial analysis) is an extension of principal component analysis (PCA) tailored to handle multiple data tables that measure sets of variables collected on the same observations, or, alternatively (in dual-MFA), multiple data tables where the same variables are measured on different sets of observations. MFA proceeds in two steps. First, it computes a PCA of each data table and ‘normalizes’ each data table by dividing all its elements by the first singular value obtained from its PCA. Second, all the normalized data tables are aggregated into a grand data table that is analyzed via a (non-normalized) PCA that gives a set of factor scores for the observations and loadings for the variables. In addition, MFA provides for each data table a set of partial factor scores for the observations that reflects the specific ‘view-point’ of this data table. Interestingly, the common factor scores can also be obtained by replacing the original normalized data tables with the normalized factor scores obtained from the PCA of each of these tables. In this article, we present MFA, review recent extensions, and illustrate it with a detailed example. © 2013 Wiley Periodicals, Inc.
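The two steps described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `normalize_table` and `mfa` are hypothetical, the only preprocessing assumed is column centering (the abstract does not specify any), observations are assumed to carry equal weight, and partial factor scores are computed as the number of tables times the projection of each normalized table onto its own block of loadings, so that their average equals the common factor scores.

```python
import numpy as np


def normalize_table(table):
    """Step 1: center the columns (assumed preprocessing), then divide the
    whole table by the first singular value of its PCA."""
    centered = table - table.mean(axis=0)
    first_singular_value = np.linalg.svd(centered, compute_uv=False)[0]
    return centered / first_singular_value


def mfa(tables):
    """Step 2: concatenate the normalized tables and run a plain PCA (via SVD).

    tables: list of 2-D arrays that share the same rows (observations).
    Returns common factor scores, loadings, and one partial-score array per table.
    """
    normalized = [normalize_table(np.asarray(t, dtype=float)) for t in tables]
    grand = np.hstack(normalized)

    u, s, vt = np.linalg.svd(grand, full_matrices=False)
    factor_scores = u * s      # common factor scores for the observations
    loadings = vt.T            # loadings for all variables, table after table

    # Partial factor scores: each table's own view of the observations.
    n_tables = len(normalized)
    partial_scores = []
    start = 0
    for z in normalized:
        stop = start + z.shape[1]
        partial_scores.append(n_tables * (z @ loadings[start:stop]))
        start = stop
    return factor_scores, loadings, partial_scores


# Hypothetical data: two tables measured on the same six observations.
rng = np.random.default_rng(0)
tables = [rng.normal(size=(6, 3)), rng.normal(size=(6, 4))]
scores, loadings, partial = mfa(tables)
```

Dividing by the first singular value gives every normalized table a largest singular value of 1, which keeps any single table from dominating the first component of the grand PCA simply because it has more variables or larger variance.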
Similar resources
How to perform multiblock component analysis in practice.
To explore structural differences and similarities in multivariate multiblock data (e.g., a number of variables have been measured for different groups of subjects, where the data for each group constitute a different data block), researchers have a variety of multiblock component analysis and factor analysis strategies at their disposal. In this article, we focus on three types of multiblock c...
Batch Process Monitoring Using Multiblock Multiway Principal Component Analysis
Batch process monitoring to detect the existence and magnitude of changes that cause a deviation from the normal operation has gained considerable attention in the last decade. There are some batch processes that occur as a single step, whereas many others include multiple phases due to operational or phenomenological regimes or multiple stages where different processing units are employed. Hav...
An ExPosition of multivariate analysis with the singular value decomposition in R
ExPosition is a new comprehensive R package providing crisp graphics and implementing multivariate analysis methods based on the singular value decomposition (svd). The core techniques implemented in ExPosition are: principal components analysis, (metric) multidimensional scaling, correspondence analysis, and several of their recent extensions such as barycentric discriminant analyses (e.g., di...
1H NMR, GC-EI-TOFMS, and data set correlation for fruit metabolomics: application to spatial metabolite analysis in melon.
A metabolomics approach combining 1H NMR and gas chromatography-electrospray ionization time-of-flight mass spectrometry (GC-EI-TOFMS) profiling was employed to characterize melon (Cucumis melo L.) fruit. In a first step, quantitative 1H NMR of polar extracts and principal component analyses (PCA) of the corresponding data highlighted the major metabolites in fruit flesh, including sugars, ...
Sparse principal component analysis for multiblock data and its extension to sparse multiple correspondence analysis
Two new methods to select groups of variables have been developed for multiblock data: "Group Sparse Principal Component Analysis" (GSPCA) for continuous variables and "Sparse Multiple Correspondence Analysis" (SMCA) for categorical variables. GSPCA is a compromise between the sparse PCA method of Zou, Hastie, and Tibshirani and the "group Lasso" method of Yuan and Lin. PCA is formulated as a regres...
Journal title:
Volume, issue:
Pages: -
Publication date: 2013